
MindSet: Vision. A toolbox for testing DNNs on key psychological experiments

Biscione, Valerio, Yin, Dong, Malhotra, Gaurav, Dujmovic, Marin, Montero, Milton L., Puebla, Guillermo, Adolfi, Federico, Heaton, Rachel F., Hummel, John E., Evans, Benjamin D., Habashy, Karim, Bowers, Jeffrey S.

arXiv.org Artificial Intelligence

Multiple benchmarks have been developed to assess the alignment between deep neural networks (DNNs) and human vision. In almost all cases these benchmarks are observational, in the sense that they are composed of behavioural and brain responses to naturalistic images that have not been manipulated to test hypotheses regarding how DNNs or humans perceive and identify objects. Here we introduce the toolbox MindSet: Vision, consisting of a collection of image datasets and related scripts designed to test DNNs on 30 psychological findings. In all experimental conditions, the stimuli are systematically manipulated to test specific hypotheses regarding human visual perception and object recognition. In addition to providing pre-generated datasets of images, we provide code to regenerate these datasets, offering many configurable parameters which greatly extend the datasets' versatility for different research contexts, and code to facilitate the testing of DNNs on these image datasets using three different methods (similarity judgments, out-of-distribution classification, and a decoder method), accessible at https://github.
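The similarity-judgment method mentioned above can be illustrated with a minimal sketch: compare a network's internal activations for a manipulated stimulus against those for the base stimulus and a control. This is an illustrative reconstruction, not the toolbox's actual API; the function names and the use of cosine similarity are assumptions.

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity between two activation vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def similarity_judgment(base_act, manipulated_act, control_act):
    # The manipulation "preserves perceived identity" for the network if
    # the manipulated image's representation stays closer to the base
    # image's representation than the control image's does.
    return cosine_similarity(base_act, manipulated_act) > \
           cosine_similarity(base_act, control_act)
```

In practice the activation vectors would be taken from a chosen layer of the DNN under test, and judgments would be aggregated over many stimulus triplets per experimental condition.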


MOReGIn: Multi-Objective Recommendation at the Global and Individual Levels

Gómez, Elizabeth, Contreras, David, Boratto, Ludovico, Salamó, Maria

arXiv.org Artificial Intelligence

Multi-Objective Recommender Systems (MORSs) emerged as a paradigm to guarantee multiple (often conflicting) goals. Besides accuracy, a MORS can operate at the global level, where additional beyond-accuracy goals are met for the system as a whole, or at the individual level, meaning that the recommendations are tailored to the needs of each user. The state-of-the-art MORSs operate either at the global or at the individual level, without assuming the co-existence of the two perspectives. In this study, we show that when global and individual objectives co-exist, MORSs are not able to meet both types of goals. To overcome this issue, we present an approach that regulates the recommendation lists so as to guarantee both global and individual perspectives, while preserving recommendation effectiveness. Specifically, as the individual perspective, we tackle genre calibration and, as the global perspective, provider fairness. We validate our approach on two real-world datasets, publicly released with this paper.
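Regulating a recommendation list for both objectives can be sketched as a greedy re-ranking: keep each genre near the user's target share (calibration) while capping how many slots any one provider takes (fairness). This is a simplified illustration under assumed inputs, not the paper's actual algorithm.

```python
from collections import Counter

def rerank(candidates, k, target_genre_share, provider_cap):
    # candidates: list of (item_id, score, genre, provider), sorted by
    # relevance score descending. Greedily fill k slots, admitting an
    # item only if its genre is still under the user's target share
    # (individual calibration) and its provider is under the per-list
    # cap (global provider fairness).
    genre_count, provider_count, out = Counter(), Counter(), []
    for item_id, score, genre, provider in candidates:
        if len(out) == k:
            break
        genre_ok = genre_count[genre] < target_genre_share.get(genre, 0.0) * k
        provider_ok = provider_count[provider] < provider_cap
        if genre_ok and provider_ok:
            out.append(item_id)
            genre_count[genre] += 1
            provider_count[provider] += 1
    return out
```

A real system would relax the hard constraints (e.g. fall back to pure relevance when too few items qualify) so that effectiveness is preserved, as the abstract requires.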


DCR-Consistency: Divide-Conquer-Reasoning for Consistency Evaluation and Improvement of Large Language Models

Cui, Wendi, Zhang, Jiaxin, Li, Zhuohang, Damien, Lopez, Das, Kamalika, Malin, Bradley, Kumar, Sricharan

arXiv.org Artificial Intelligence

Evaluating the quality and variability of text generated by Large Language Models (LLMs) poses a significant, yet unresolved research challenge. Traditional evaluation methods, such as ROUGE and BERTScore, which measure token similarity, often fail to capture holistic semantic equivalence. This results in a low correlation with human judgments and intuition, which is especially problematic in high-stakes applications like healthcare and finance, where reliability, safety, and robust decision-making are critical. This work proposes DCR, an automated framework for evaluating and improving the consistency of LLM-generated texts using a divide-conquer-reasoning approach. Unlike existing LLM-based evaluators that operate at the paragraph level, our method employs a divide-and-conquer evaluator (DCE) that breaks down the paragraph-to-paragraph comparison between two generated responses into individual sentence-to-paragraph comparisons, each evaluated based on predefined criteria. To facilitate this approach, we introduce an automatic metric converter (AMC) that translates the output from DCE into an interpretable numeric score. Beyond the consistency evaluation, we further present a reason-assisted improver (RAI) that leverages the analytical reasons with explanations identified by DCE to generate new responses aimed at reducing these inconsistencies. Through comprehensive and systematic empirical analysis, we show that our approach outperforms state-of-the-art methods by a large margin (e.g., +19.3% and +24.3% on the SummEval dataset) in evaluating the consistency of LLM generation across multiple benchmarks in semantic, factual, and summarization consistency tasks. Our approach also substantially reduces nearly 90% of output inconsistencies, showing promise for effective hallucination mitigation.
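The DCE and AMC steps described above can be sketched as follows: split one response into sentences, judge each sentence against the other response as a whole, and convert the per-sentence verdicts into a numeric score. This is an illustrative sketch, not the paper's implementation; in DCR the per-sentence judge is an LLM prompted with predefined criteria, whereas here it is an injected callable.

```python
import re

def split_sentences(paragraph):
    # Naive sentence splitter; a production system would use an NLP library.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", paragraph) if s.strip()]

def divide_and_conquer_score(response, reference, sentence_judge):
    # DCE step: judge each sentence of `response` against the whole
    # `reference` paragraph (sentence-to-paragraph comparison).
    # AMC step: convert the boolean verdicts (True = consistent) into a
    # single interpretable score in [0, 1].
    verdicts = [sentence_judge(s, reference) for s in split_sentences(response)]
    return sum(verdicts) / len(verdicts) if verdicts else 0.0
```

The RAI step would then feed the sentences judged inconsistent, together with the judge's stated reasons, back into the generator to produce a revised response.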


Understanding the oceans and climate change – the OcéanIA project and Tara expedition

AIHub

Researchers on the OcéanIA project are developing new artificial intelligence and mathematical modelling tools to contribute to the understanding of the oceans and their role in regulating and sustaining the biosphere, and tackling climate change. You may have seen our recent interview with the director of the project, and of Inria Chile, Nayat Sánchez-Pi. She explained the challenges of research in the field, what they are working on as part of the project, and the role that AI methods play. A key part of the project is data, and much of this is being collected by the Tara Microbiome-CEODOS expedition. The objective of this expedition is to study the marine microorganisms which play a fundamental role in ocean ecosystems.


Interview with Nayat Sánchez-Pi – how the OcéanIA project is advancing our understanding of the oceans and our climate

AIHub

Nayat Sánchez-Pi is the Director of the Inria Chile Research Center. We asked her about her research and about the OcéanIA project, which she leads. The aim of the OcéanIA project is to develop new artificial intelligence and mathematical modeling tools to contribute to the understanding of the oceans and their role in regulating and sustaining the biosphere, and to tackling climate change. I have been working in the area of artificial intelligence and machine learning for more than 15 years now. During this time I have always had an interest in finding ways of taking the state of the art in my area of research and applying it to have a direct impact on the real world.


Knowledge Graphs

Hogan, Aidan, Blomqvist, Eva, Cochez, Michael, d'Amato, Claudia, de Melo, Gerard, Gutierrez, Claudio, Gayo, José Emilio Labra, Kirrane, Sabrina, Neumaier, Sebastian, Polleres, Axel, Navigli, Roberto, Ngomo, Axel-Cyrille Ngonga, Rashid, Sabbir M., Rula, Anisa, Schmelzeisen, Lukas, Sequeda, Juan, Staab, Steffen, Zimmermann, Antoine

arXiv.org Artificial Intelligence

In this paper we provide a comprehensive introduction to knowledge graphs, which have recently garnered significant attention from both industry and academia in scenarios that require exploiting diverse, dynamic, large-scale collections of data. After a general introduction, we motivate and contrast various graph-based data models and query languages that are used for knowledge graphs. We discuss the roles of schema, identity, and context in knowledge graphs. We explain how knowledge can be represented and extracted using a combination of deductive and inductive techniques. We summarise methods for the creation, enrichment, quality assessment, refinement, and publication of knowledge graphs. We provide an overview of prominent open knowledge graphs and enterprise knowledge graphs, their applications, and how they use the aforementioned techniques. We conclude with high-level future research directions for knowledge graphs.